Abstract. Given a probability distribution on an open book (a metric space obtained
by gluing a disjoint union of copies of a half-space along their boundary hyperplanes),
we define a precise concept of when the Fréchet mean (barycenter) is sticky.
This non-classical phenomenon is quantified by a law of large numbers (LLN) stating 
that the empirical mean eventually almost surely lies on the (codimension 1 and hence 
measure 0) spine that is the glued hyperplane, and a central limit theorem (CLT) 
stating that the limiting distribution is Gaussian and supported on the spine. We also 
state versions of the LLN and CLT for the cases where the mean is nonsticky (that 
is, not lying on the spine) and partly sticky (that is, on the spine but not sticky). 



Introduction 

The mean of a finite set of points in Euclidean space moves slightly when one of the 
points is perturbed. This motion is pervasive in classical probabilistic and statistical 
situations. In geometric contexts, the barycenter (Fréchet mean, $L^2$-minimizer, least
squares approximation), which minimizes the sum of the square distances to the given 
points, generalizes the notion of mean. Intuitively, the barycenter of a well-behaved 
probability distribution on a space M of dimension d + 1 ought to avoid lying on any 
particular subspace of dimension d or less, if the distribution is generic. While this 
intuition has been shown rigorous when M is a manifold [Jup88, HL96, BP05, Huc11],
it can fail when M has certain types of singularities, as we demonstrate here for an
open book O: a space obtained by gluing disjoint copies of a half-space along their
boundary hyperplanes; see Section 1 for precise definitions.

Example. The simplest singular space is the 3-spider: a union $T_3$ of three rays with
their endpoints glued at a point 0 (Figure 1, left). This space $T_3$ is the open book O
of dimension 1 with three leaves. If three points are chosen equidistant from 0 on
the different rays, then the barycenter lies at 0 by symmetry (Figure 1, center). The
unexpected phenomenon is that wiggling one or more of the points has no effect on the
barycenter (Figure 1, right). For instance, if the points lie at radius r from 0, then the
barycenter remains at 0 upon moving one of the points to radius at most 2r.
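
The 2r threshold in the Example can be checked numerically. The following is a minimal sketch rather than part of the original argument: the function name `spider_barycenter` and the encoding of a point as a `(ray, radius)` pair are illustrative assumptions. It relies only on the fact that, on the 3-spider, two points on the same ray are at distance equal to the difference of their radii, and two points on different rays are at distance equal to the sum of their radii.

```python
def spider_barycenter(points):
    """Barycenter of points on a spider (rays glued at a common vertex).

    points: list of (ray, radius) pairs with radius >= 0 (hypothetical encoding).
    Returns (ray, radius); (None, 0.0) means the barycenter is the vertex.
    """
    rays = sorted({ray for ray, _ in points})
    n = len(points)
    best = None
    for k in rays:
        # Signed coordinate seen from ray k: +r on ray k, -r on any other ray,
        # so the distance from radius t on ray k to a sample is |t - signed|.
        signed = [r if ray == k else -r for ray, r in points]
        # The sum of squares is quadratic in t; minimize it on [0, infinity).
        t = max(0.0, sum(signed) / n)
        cost = sum((t - s) ** 2 for s in signed)
        if best is None or cost < best[0]:
            best = (cost, k, t)
    _, k, t = best
    return (k, t) if t > 0 else (None, 0.0)
```

With r = 1, moving one point out to radius 2 leaves the barycenter at the vertex, while moving it to radius 2.5 pulls the barycenter onto that ray, at radius 1/6.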
[Figure 1 panels: (left) $T_3$; (center) b = barycenter of three points; (right) b stable under perturbation.]



Figure 1. (left) The space $T_3$ of rooted phylogenetic trees with three leaves
and fixed pendant edge lengths; (center) the probability distribution supported
on three points in $T_3$ equidistant from the vertex 0 has barycenter 0;
(right) perturbing the distribution (and even macroscopically moving all three
points a limited distance) leaves the barycenter fixed.

Our main goal is to define a precise concept of when a distribution on an open book 
has a sticky mean, and to quantify this highly non-classical condition with a law of large 
numbers (LLN) in Theorem 4.3 and a central limit theorem (CLT) in Theorem 5.7.

Roughly speaking, the sticky LLN says that in certain situations, empirical (sample) 
means almost surely eventually lie on the spine: the hyperplane shared by all of the 
glued half-spaces by virtue of the gluing (in Figure 1, the spine is the point 0). This
phenomenon contrasts with the classical LLN, where the empirical mean approaches 
the theoretical mean from all directions. The sticky CLT says that the limiting dis- 
tribution is Gaussian and supported on the spine. Again, the non-classical nature of 
this result contrasts with the classical CLT, in which the limiting distribution has full 
support rather than being supported on a thin (positive codimension and hence
measure zero) subset of the sample space. Versions of the LLN and CLT are also stated in
Theorems 4.3, 5.7, and 5.11 for the cases where the mean is

• nonsticky — not lying on the spine — so the LLN and CLT behave classically; and 

• partly sticky — on the spine but not sticky — so the LLN and CLT are hybrids of 
the sticky and nonsticky ones. 

Open books are the simplest singular topologically stratified spaces. Roughly speak- 
ing, topologically stratified spaces decompose as finite disjoint unions of manifolds 
(strata) in such a way that the singularities of the total space are constant along each 
stratum (this is the structure described in [GM88, Section 1.4]). Every topologically
stratified space that is singular along a stratum of codimension 1 is, by definition of 
topological stratification, locally homeomorphic to an open book along that stratum. 
Therefore, even if the goal is to sample from arbitrary stratified spaces possessing 
singularities in maximal dimension, it is first necessary to understand open books. 

Sticky means on open books seem to stem from topological phenomena, rather than 
geometric ones. Therefore, although the topological space O can be endowed with 
many metrics, we consider only the simplest, in which each half-space has the Euclidean 
metric and the boundaries are glued isometrically. More general metrics on open books 
and more general singularities on stratified spaces form part of the wider program of
probability on stratified spaces, initiated here, whose overarching goals are to 

1. understand how asymptotics of sampling relates to topology and geometry of 
singularities, and 

2. develop quantitative computational statistical methods for handling data sam- 
pled from stratified spaces. 

Apropos the second goal, this paper is output from a Working Group, on sampling 
from stratified spaces, that ran under the Statistical and Applied Mathematical Sci- 
ences Institute (SAMSI) 2010-2011 program on Analysis of Object Data. The group, 
in which most of the authors were members, considered practical applied questions from 
evolutionary biology, medical imaging, and shape analysis. Key examples of stratified 
spaces considered by the group therefore included 

• shape spaces, representing equivalence classes of point configurations under oper- 
ations such as rotation, translation, scaling, projective transformations, or other 
non-linear transformations (for example, see [DM98, PM03, PLS10] for direct
similarities, affine transformations, and projective transformations, respectively); 

• spaces of covariance matrices, arising as data points in diffusion tensor imaging 
(see [AFPA06, BaP96, Sch08, SMT08], for example); and

• tree spaces, representing metric phylogenetic trees on fixed sets of taxa (see 
[BHV01, OP11, MOP11], for example).

For instance, the space $T_3$ from the Example parametrizes all rooted (metric)
phylogenetic trees with three taxa and fixed pendant edge lengths. More generally, open
books of arbitrary dimension and precisely three leaves reflect the local structure of 
phylogenetic tree space nearby any point on a stratum of codimension 1; such a point 
represents a tree possessing a node with non-binary branching. Our observations of
"unresolved" (that is, non-binary) trees as barycenters of biologically meaningful
samples (see [MOP11, Examples 5.10 and 5.11] for descriptions of cases involving yeast
phylogenies and brain arteries) constituted crucial motivation for the present study. 

The relation between open books and tree spaces is that of local to global. After 
completing a draft of this paper, we found that Basrak [Bas10] had independently
and simultaneously proved a sticky CLT for certain global situations in dimension 1, 
namely arbitrary binary trees: connected graphs with no cycles where each node is 
incident to at most three edges. In contrast, our dimension 1 results are local, in that 
all edges meet, but there can be more than three incident to the intersection. 

It bears mentioning that unlike in open books, barycenters do not stick to thin 
subspaces of shape spaces, or to thin subspaces of more general quotients of manifolds 
by isometric proper actions of Lie groups [Huc12]. The differentiating property amounts
to curvature: open books are, in a precise sense, negatively curved, whereas passing 
to the quotient in the construction of shape spaces adds positive curvature. Basrak's 
binary trees [Bas10] are negatively curved in the same way that open books or
spaces of trees are [BHV01]: they are globally nonpositively curved (or CAT(0) spaces).
It is a principal long-term goal of our investigations to tease out the connection between 
stickiness of means of probability distributions with values in metric spaces and notions 
of negative curvature. 

Finally, modern applications of statistics require knowledge of the asymptotics of dis- 
tributions on singular spaces, including algebraic varieties and polyhedral complexes 
such as shape spaces and tree spaces. The natural generality seems to be topologically 
stratified spaces, or the more restrictive class of Whitney stratified spaces [GM88,
Section 1.2], which seems to contain all of the spaces in statistical applications (including
all real semialgebraic varieties). However, for the purposes of data analysis, a central 
limit theorem is only a vehicle toward parameter estimation. Further steps would re- 
quire the derivation of Slutsky-type theorems for random objects on open books and 
more general stratified spaces. Eventually, the goal lies in generalizing the rich sta- 
tistical toolkit to stratified spaces. For instance, to date there are generalizations of 
the powerful techniques of regression to Riemannian manifolds [SSL+09], as well as of
principal component analysis (PCA) and multivariate analysis of variance (MANOVA)
to shape spaces [HHM10a, HHM10b].

1. Open books

Set $S = \mathbb{R}^d$, the real vector space of dimension d with the standard Euclidean metric.
If $\mathbb{R}_{\ge 0} = [0, \infty)$ is the closed nonnegative ray in the real line, then the closed half-space

\[ \overline{H}_+ = \mathbb{R}_{\ge 0} \times S \]

is a metric subspace of $\mathbb{R}^{d+1} = \mathbb{R} \times S$ with boundary $S$, which we identify with $H = \{0\} \times S$, and interior $H_+ = \mathbb{R}_{>0} \times S$. The open book $O$ is the quotient of the disjoint
union $\overline{H}_+ \times \{1, \dots, K\}$ of $K$ closed half-spaces modulo the equivalence relation that
identifies their boundaries. Therefore $p = (x, k) = (x^{(0)}, x^{(1)}, \dots, x^{(d)}, k)$ is identified
with $q = (y, j) = (y^{(0)}, y^{(1)}, \dots, y^{(d)}, j)$ whenever $x^{(0)} = 0$ and $x^{(i)} = y^{(i)}$ for all
$i \in \{0, \dots, d\}$, regardless of $k$ and $j$. The following definition summarizes and introduces
terminology. 

Definition 1.1 (Leaves and spine). The open book $O$ consists of $K \ge 3$ leaves $L_k$, for
$k = 1, \dots, K$, each of dimension $d + 1$ and defined by

\[ L_k = \overline{H}_+ \times \{k\}. \]

The leaves are joined together along the spine $L_0$, which comprises the equivalence classes
in $\bigcup_{k=1}^{K} (H \times \{k\})$; that is, $L_0$ can be identified with the hyperplane $H = \{0\} \times S$ or with
the space $S = \mathbb{R}^d$. When we speak of the spine in the following, we make clear which
of these three instances of the spine we have in mind. The following diagram gives an
overview of these instances, spaces and mappings introduced further below.

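[Diagram: the open book $O$, the spine, and the maps between them, including the spine projection $P_S = \pi_S \circ F_k$.]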



Example 1.2. A piece of the open book with d = 1 and K = 5 is depicted below: 




Ideally, the picture of this embedding would continue to infinity vertically, both up and 
down as well as away from the spine on every leaf. 

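Definition 1.3 (Reflection). For a given point $x \in \overline{H}_+$, let $Rx \in \overline{H}_- = \mathbb{R}_{\le 0} \times \mathbb{R}^d =
(-\infty, 0] \times \mathbb{R}^d$ denote its reflection across the hyperplane $\overline{H}_+ \cap \overline{H}_- = \{0\} \times S$.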

The metric $d$ on $O$ is expressed in terms of reflection in a natural way: given two
points $p, q \in O$, with $p = (x, k)$ and $q = (y, j)$,

\[ d(p, q) = \begin{cases} |x - y| & \text{if } k = j, \\ |x - Ry| & \text{if } k \ne j, \end{cases} \tag{1.1} \]

where $|x - y|$ denotes Euclidean distance on $\mathbb{R}^{d+1}$. Note that if $k \ne j$ in Eq. (1.1), then
$d(p, q) = 0$ if and only if $x$ and $y$ lie on the spine and coincide. Our assumption $K \ge 3$
implies that $O$ is not isometric to a subset of $\mathbb{R}^{d+1}$ (as it would be for $K \le 2$).
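
As an illustration of Eq. (1.1), the sketch below computes the distance between two points of the open book. It is not taken from the paper: the names `reflect` and `dist`, and the encoding of a point as a pair `(x, k)` with `x` a tuple of `d + 1` floats and `k` the leaf index, are assumptions made for this example. It uses `math.dist`, which is available in Python 3.8 and later.

```python
import math

def reflect(x):
    """Reflection R across the hyperplane {0} x S: negate the 0th coordinate."""
    return (-x[0],) + tuple(x[1:])

def dist(p, q):
    """Distance between p = (x, k) and q = (y, j) following Eq. (1.1).

    Points on the same leaf are compared directly; otherwise y is first
    reflected into the opposite half-space.
    """
    (x, k), (y, j) = p, q
    if k != j:
        y = reflect(y)
    return math.dist(x, y)
```

For example, two points at height 1 above the same spine point, on different leaves, are at distance 2, while a spine point has distance 0 to itself regardless of which leaf label it carries.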



The next lemma refers to globally nonpositive curvature. See [Stu03] for a definition
and background. The only times we apply this concept here are in noting the
uniqueness of barycenters in our context (see Definition 3.1 and the line following it) and to
obtain a quick proof of a Strong Law of Large Numbers (Lemma 4.2).

Lemma 1.4. The open book $(O, d)$ is a Hausdorff metric space that is globally
nonpositively curved, and its spine is isometric to $\mathbb{R}^d$.

Proof. [Stu03, Example 3.3]. □

Lemma 1.5. The open book $O$ is the disjoint union

\[ O = L_0 \cup L_1^+ \cup \dots \cup L_K^+ \]

of the spine $L_0$ and the interiors $L_k^+ = L_k \setminus L_0$ of the leaves, $k = 1, \dots, K$.

Remark 1.6. Although the open book $O$ is not a vector space over $\mathbb{R}$, scaling by a
positive constant $\lambda \in \mathbb{R}_{>0}$ is defined in the natural way:

\[ \lambda p = (\lambda x, k) \quad \text{for all } p = (x, k) \in O. \]

The open book also carries an action of the spine $S$, considered as an additive group,
by translation, via the action of $S$ on each leaf:

\[ O \ni p = (x^{(0)}, x^{(1)}, \dots, x^{(d)}, k) \mapsto (x^{(0)}, x^{(1)} + z^{(1)}, \dots, x^{(d)} + z^{(d)}, k) \in O, \]

with $z = (z^{(1)}, \dots, z^{(d)}) \in S$. For the above right-hand side we write simply $z + p$.

2. Probability measures on the open book 

Our goal is to understand the statistical behavior of points sampled randomly from O. 
Suppose that $\mu$ is a Borel probability measure on $O$. We assume throughout the paper
that $d(0, q)$ has bounded expectation under the measure $\mu$:

\[ \int_O d(0, q) \, d\mu(q) < \infty. \tag{2.2} \]

When explicitly stated, we also assume the stronger condition

\[ \int_O d(0, q)^2 \, d\mu(q) < \infty \tag{2.3} \]

of square integrability.

Lemma 2.1. Any Borel probability measure $\mu$ on the open book $O$ decomposes uniquely
as a weighted sum of Borel probability measures $\mu_k$ on the open leaves $L_k^+$ and a Borel
probability measure $\mu_0$ on the spine $L_0$. More precisely, there are nonnegative real
numbers $\{w_k\}_{k=0}^{K}$ summing to 1 such that, for any Borel set $A \subseteq O$, the measure $\mu$
takes the value

\[ \mu(A) = w_0 \mu_0(A \cap L_0) + \sum_{k=1}^{K} w_k \mu_k(A \cap L_k^+). \]

Proof. By Lemma 1.5, the spine and open leaves are measurable and partition $O$; hence
the result follows from the additivity of measures on disjoint sets. □

Remark 2.2. For $k \ge 1$, $w_k = \mu(L_k^+)$ is the probability that a random point lies in $L_k^+$,
while $w_0 = \mu(L_0)$ is the probability that a point lies somewhere on the spine.

Convention 2.3. Throughout this paper, assume the nondegeneracy condition

\[ w_k = \mu(L_k^+) > 0 \quad \text{for all } k \in \{1, \dots, K\}. \tag{2.4} \]

Otherwise, we would remove those leaves for which $\mu(L_k^+) = 0$ from the open book.
Nondegeneracy implies that $w_0 < 1$ and $0 < w_k < 1$ for all $k \ge 1$ in the decomposition
from Lemma 2.1.



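Definition 2.4 (Folding map). For $k \in \{1, \dots, K\}$ the $k$th folding map $F_k : O \to \mathbb{R}^{d+1}$
sends $p \in O$ to

\[ F_k p = \begin{cases} x & \text{if } p = (x, k) \in L_k, \\ Rx & \text{if } p = (x, j) \in L_j \text{ and } j \ne k, \end{cases} \]

where the reflection operator $R$ was defined in Definition 1.3.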
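
The folding map is simple to state in code. The following sketch is illustrative only: the function name `fold` and the encoding of a point of the open book as `(x, j)`, with `x` a tuple of `d + 1` floats whose 0th entry is nonnegative, are assumptions of this example rather than notation from the paper.

```python
def fold(p, k):
    """k-th folding map F_k applied to p = (x, j).

    Leaf k is kept as the closed positive half-space; every other leaf is
    reflected into the closed negative half-space by negating x[0].
    """
    x, j = p
    if j == k:
        return tuple(x)
    return (-x[0],) + tuple(x[1:])
```

A spine point (with `x[0] == 0`) is sent to the same point of the hyperplane for every `k`, in agreement with the identification of the boundaries.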



Remark 2.5. In the definition of the folding map $F_k$, the leaf $L_k$ is identified with
the subset $\overline{H}_+ \subset \mathbb{R}^{d+1}$, by slight abuse of notation (again). The other leaves $L_j$ are
collapsed to the negative half-space $\overline{H}_- \subset \mathbb{R}^{d+1}$ via the reflection map. All of these
identifications have the same effect on the spine $S$, which becomes the hyperplane $H =
\{0\} \times \mathbb{R}^d \subset \mathbb{R}^{d+1}$. For example, $F_4$ takes the picture in Example 1.2 to $\mathbb{R}^2$ as follows.




The notations $H_+$ and $H_-$ (with no bars) are reserved for the strictly positive and
strictly negative open half-spaces that are the interiors of $\overline{H}_+$ and $\overline{H}_-$, respectively.

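Lemma 2.6. Under the folding map $F_k$, the measure $\mu$ pushes forward to a measure
$\tilde\mu_k = \mu \circ F_k^{-1}$ on $\mathbb{R}^{d+1}$ such that, given a Borel subset $A \subseteq \mathbb{R}^{d+1}$,

\[ \tilde\mu_k(A) = w_0 \mu_0(A \cap H) + w_k \mu_k(A \cap H_+) + \sum_{j \ne k} w_j \mu_j\big(R(A \cap H_-)\big), \]

where $L_0$ is identified with $H$ and each open leaf with $H_+$.

Proof. Lemma 2.1. □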

Definition 2.7 (First moment on a leaf). Let $x^{(0)}, \dots, x^{(d)}$ be the coordinate functions
on $\mathbb{R}^{d+1}$. The first moment of the measure $\mu$ on the $k$th leaf $L_k$ is the real number

\[ m_k = \int_{\mathbb{R}^{d+1}} x^{(0)} \, d\tilde\mu_k(x) = \int_O \pi_0 F_k p \, d\mu(p), \]

where $\pi_0 : \mathbb{R}^{d+1} \to \mathbb{R}$ is the orthogonal projection with kernel $H = \{0\} \times \mathbb{R}^d$.

Remark 2.8. For any point $p \in O$, the projection $\pi_0 F_k p$ is positive if $p \in L_k^+$ and
negative if $p \in L_j^+$ for some $j \ne k$. Moreover, $|\pi_0 F_k p| = |x^{(0)}|$ is the distance of $p$ from
the spine. The integrability in Eq. (2.2) guarantees that the first moments of $\mu$ are
all finite.

Theorem 2.9. Under integrability (2.2) and nondegeneracy (2.4), either

1. $m_j < 0$ for all indices $j \in \{1, \dots, K\}$,

or there is exactly one index $k \in \{1, \dots, K\}$ such that $m_k \ge 0$, in which case either

2. $m_k > 0$, or

3. $m_k = 0$.



Proof. For $k = 1, \dots, K$, let

\[ v_k = \int_{L_k^+} x^{(0)} \, d\mu_k(x). \]

The nondegeneracy (2.4) implies that $v_k > 0$. Observe that

\[ m_k = w_k v_k - \sum_{j \ge 1,\, j \ne k} w_j v_j. \]

For any $j \ne k \in \{1, \dots, K\}$,

\[ m_j = w_j v_j - \sum_{\ell \ge 1,\, \ell \ne j} w_\ell v_\ell \le w_j v_j - w_k v_k \le \sum_{\ell \ge 1,\, \ell \ne k} w_\ell v_\ell - w_k v_k = -m_k, \]

since the weights $w_\ell$ are nonnegative. Therefore, if $m_k > 0$ for some $k$, then $m_j \le
-m_k < 0$ for all $j \ne k$. Also, if $m_k = 0$ for some index $k$, then $m_j \le 0$ for all $j \ne k$.

Now suppose there are two indices $j, k \in \{1, \dots, K\}$ such that $j \ne k$ and $m_j = 0$
and $m_k = 0$. Then

\[ 0 = m_j = w_j v_j - w_k v_k - \sum_{\ell \ge 1,\, \ell \ne j,k} w_\ell v_\ell \]

and

\[ 0 = m_k = w_k v_k - w_j v_j - \sum_{\ell \ge 1,\, \ell \ne j,k} w_\ell v_\ell. \]

Adding these two equalities results in

\[ 0 = m_j + m_k = -2 \sum_{\ell \ge 1,\, \ell \ne j,k} w_\ell v_\ell. \]

Since $w_\ell v_\ell \ge 0$, it follows that $w_\ell v_\ell = 0$ for all $\ell \ne j, k$. Consequently, $\mu(L_\ell^+) = 0$ for
all $\ell \ne j, k$. However, this contradicts nondegeneracy (2.4) and the fact that $K \ge 3$.
Hence at most one of the numbers $m_k$ can be nonnegative. □

Motivated by Theorem 4.3 and Corollary 4.4, we use the following terms to describe
the three mutually exclusive conditions given in Theorem 2.9.

Definition 2.10. Under integrability (2.2) and nondegeneracy (2.4), we say that the
mean of the measure $\mu$ is either

1. sticky if $m_j < 0$ for all indices $j \in \{1, \dots, K\}$, or

2. nonsticky if $m_k > 0$ for some (unique) $k \in \{1, \dots, K\}$, or

3. partly sticky if $m_k = 0$ for some (unique) $k \in \{1, \dots, K\}$.
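
For a finitely supported measure, the first moments and the resulting trichotomy can be computed directly. The sketch below is a hedged illustration, not the authors' procedure: the names `first_moments` and `classify`, the point encoding `(x, leaf)` with `x[0]` the distance from the spine, and the floating-point tolerance `tol` used to detect a vanishing moment are all assumptions of this example.

```python
def first_moments(points, weights, K):
    """First moments m_1..m_K of the measure putting weights[n] on points[n].

    Each point is (x, leaf) with x[0] >= 0 its distance from the spine.
    Folding onto leaf k keeps x[0] on leaf k and negates it elsewhere.
    """
    moments = {}
    for k in range(1, K + 1):
        moments[k] = sum(
            w * (x[0] if leaf == k else -x[0])
            for (x, leaf), w in zip(points, weights)
        )
    return moments

def classify(moments, tol=1e-12):
    """Classify the mean from the largest first moment (tol is an assumption)."""
    top = max(moments.values())
    if top > tol:
        return "nonsticky"
    if top < -tol:
        return "sticky"
    return "partly sticky"
```

On the 3-spider with equal weights and two points at radius 1, placing the third point at radius 1, 2, or 3 gives a sticky, partly sticky, or nonsticky mean, respectively. In floating point, an exactly vanishing moment can only be detected up to the chosen tolerance.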

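Remark 2.11. If square integrability (2.3) also holds, the first moment $m_k$ may be
identified with the partial derivative

\[ m_k = -\frac{1}{2} \left. \frac{\partial}{\partial x^{(0)}} \Gamma_k(x) \right|_{x^{(0)} = 0}, \]

where $\Gamma_k : \mathbb{R}^{d+1} \to \mathbb{R}$ is defined by

\[ \Gamma_k(x) = \int_{\mathbb{R}^{d+1}} |x - y|^2 \, d\tilde\mu_k(y). \]

Observe that $\frac{\partial \Gamma_k}{\partial x^{(0)}}(x)$ depends on $x^{(0)}$ but not on $(x^{(1)}, \dots, x^{(d)})$.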

3. Sample means 

For any finite collection of points $\{p_n\}_{n=1}^{N} \subset O$, the Fréchet mean is a natural
generalization of the arithmetic mean in Euclidean space:

Definition 3.1. The Fréchet mean, or barycenter, of a set $\{p_n\}_{n=1}^{N} \subset O$ of points is

\[ b(p_1, \dots, p_N) = \operatorname*{arg\,min}_{p \in O} \sum_{n=1}^{N} d(p, p_n)^2. \]

By Lemma 1.4 and [Stu03, Proposition 4.3], the barycenter $b(p_1, \dots, p_N) \in O$ exists
and is unique.

Definition 3.2. For fixed $k \in \{1, \dots, K\}$, the point $\eta_{k,N} \in \mathbb{R}^{d+1}$ defined by

\[ \eta_{k,N} = \frac{1}{N} \sum_{n=1}^{N} F_k p_n \tag{3.5} \]

is the $k$th folded average: the barycenter of the pushforward under the $k$th folding map.

For a set of points $\{p_n\}_{n=1}^{N} \subset O$, the condition $b(p_1, \dots, p_N) \in L_0$ does not
necessarily imply $\eta_{k,N} \in H$. Nevertheless, the following lemma establishes an important
relationship between $b(p_1, \dots, p_N)$ and $\eta_{k,N}$. Specifically, taking barycenters commutes
with the $k$th folding in two cases: if the barycenter lies off the spine in $L_k^+$; or if the $k$th
folded average lies in the closure of the positive half-space.

Lemma 3.3. Let $\{p_n\}_{n=1}^{N} \subset O$ and $b_N = b(p_1, \dots, p_N)$. If $b_N \in L_k^+$, then $\eta_{k,N} \in H_+$
and $\eta_{k,N} = F_k b_N$. If $\eta_{k,N} \in \overline{H}_+$, then $b_N \in L_k$ and $F_k b_N = \eta_{k,N}$ (i.e. $b_N = (\eta_{k,N}, k)$).

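Proof. Let $k, \ell \in \{1, \dots, K\}$. If $p \in L_k$, then $d(p, p_n) = |F_k p - F_k p_n|$. Therefore, if
$b_N \in L_k^+$ then

\[ b_N = \operatorname*{arg\,min}_{p \in O} \sum_{n=1}^{N} d(p, p_n)^2 = \operatorname*{arg\,min}_{p \in L_k^+} \sum_{n=1}^{N} |F_k p - F_k p_n|^2. \]

Since $F_k$ is continuously bijective from $L_k$ to $\overline{H}_+$, this implies that the function

\[ z \mapsto \sum_{n=1}^{N} |z - F_k p_n|^2 \]

attains a local minimum in the open set $H_+$. However, this functional has only one
local minimizer, which must be the unique global minimizer $\eta_{k,N}$:

\[ \eta_{k,N} = \operatorname*{arg\,min}_{z \in \mathbb{R}^{d+1}} \sum_{n=1}^{N} |z - F_k p_n|^2. \]

Consequently, $\eta_{k,N} \in H_+$ and hence $F_k b_N = \eta_{k,N}$.

If $b_N \notin L_k$, then $b_N \in L_\ell^+$ for some $\ell \ne k$. Hence $\eta_{\ell,N} = F_\ell b_N$, as we have shown. In
particular, $\eta_{\ell,N} \in H_+$ and $\pi_0 \eta_{\ell,N} > 0$. Hence

\[ \sum_{p_n \in L_\ell^+} \pi_0 F_\ell p_n > - \sum_{p_n \notin L_\ell^+} \pi_0 F_\ell p_n \ge - \sum_{p_n \in L_k} \pi_0 F_\ell p_n = \sum_{p_n \in L_k} \pi_0 F_k p_n. \tag{3.6} \]

Therefore,

\[ \pi_0 \eta_{k,N} = \frac{1}{N} \sum_{n=1}^{N} \pi_0 F_k p_n \le \frac{1}{N} \sum_{p_n \in L_k} \pi_0 F_k p_n + \frac{1}{N} \sum_{p_n \in L_\ell^+} \pi_0 F_k p_n. \]

Because of Eq. (3.6), and because $\pi_0 F_k p_n = -\pi_0 F_\ell p_n$ for $p_n \in L_\ell^+$, this last expression is negative. Hence, we have shown that
$b_N \notin L_k$ implies $\eta_{k,N} \in H_-$. Therefore, if $\eta_{k,N} \in \overline{H}_+$ it must be that $b_N \in L_k$.
Consequently, as above,

\[ b_N = \operatorname*{arg\,min}_{p \in O} \sum_{n=1}^{N} d(p, p_n)^2 = \operatorname*{arg\,min}_{p \in L_k} \sum_{n=1}^{N} |F_k p - F_k p_n|^2 = F_k^{-1} \Big( \operatorname*{arg\,min}_{z \in \overline{H}_+} \sum_{n=1}^{N} |z - F_k p_n|^2 \Big) = F_k^{-1} \eta_{k,N}. \]

Note that $F_k^{-1} \eta_{k,N}$ is well-defined, since $\eta_{k,N} \in \overline{H}_+$. □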
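
Lemma 3.3, combined with Lemma 3.5 below, suggests a direct way to compute the barycenter of a finite sample: test each folded average, and fall back to the spine if none lies in the open positive half-space. The following is a sketch under stated assumptions, not an implementation from the paper: the function name `barycenter` and the encoding of points as `(x, leaf)` with `x` a tuple of `d + 1` floats are hypothetical.

```python
def barycenter(points, K):
    """Frechet mean of points [(x, leaf), ...] in an open book with K leaves.

    Returns (coords, leaf); leaf is None when the mean lies on the spine.
    """
    n = len(points)
    dim = len(points[0][0])
    for k in range(1, K + 1):
        # k-th folded average: x[0] counts positively on leaf k, negatively
        # elsewhere; the spine coordinates are averaged unchanged.
        eta = [0.0] * dim
        for x, leaf in points:
            sign = 1.0 if leaf == k else -1.0
            eta[0] += sign * x[0] / n
            for i in range(1, dim):
                eta[i] += x[i] / n
        if eta[0] > 0:
            # Folded average in the open half-space: it is the barycenter.
            return tuple(eta), k
    # No folded average is strictly positive: the barycenter is on the spine,
    # at the average of the spine projections of the sample.
    spine = tuple(sum(x[i] for x, _ in points) / n for i in range(1, dim))
    return (0.0,) + spine, None
```

The loop visits at most `K` folded averages, so the cost is linear in the sample size for a fixed number of leaves, with no iterative optimization required.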

Definition 3.4. Given a point $p = (x, j) = (x^{(0)}, x^{(1)}, \dots, x^{(d)}, j) \in O$,

\[ P_S p = (x^{(1)}, \dots, x^{(d)}) \in S \]

is the orthogonal projection of $p$ onto the spine $S$.

The following lemma shows that taking barycenters commutes with projection to 
the spine. 



Lemma 3.5. If $\{p_n\}_{n=1}^{N} \subset O$ and

\[ \bar y_N = \frac{1}{N} \sum_{n=1}^{N} P_S p_n, \]

then $\bar y_N = P_S b(p_1, \dots, p_N)$.

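Proof. Let $\pi_S : \mathbb{R}^{d+1} \to \mathbb{R}^d$ be the orthogonal projection onto the last $d$ coordinates.
Let $b_N = b(p_1, \dots, p_N)$. If $b_N \in L_k^+$ for some $k$, then $\eta_{k,N} = F_k b_N$ by Lemma 3.3.
Therefore, since $P_S p = \pi_S F_k p$ for all $p \in O$,

\[ P_S b_N = \pi_S F_k b_N = \pi_S \eta_{k,N} = \frac{1}{N} \sum_{n=1}^{N} \pi_S F_k p_n = \frac{1}{N} \sum_{n=1}^{N} P_S p_n = \bar y_N. \]

On the other hand, if $b_N \in L_0$ then by definition of $b_N$,

\[ b_N = \operatorname*{arg\,min}_{p \in L_0} \sum_{n=1}^{N} d(p, p_n)^2 = \operatorname*{arg\,min}_{p \in L_0} \sum_{n=1}^{N} \big( |\pi_0 p_n|^2 + |P_S p - P_S p_n|^2 \big), \]

where $|\pi_0 p_n|$ is the distance of $p_n$ from the spine. Therefore $P_S b_N = \operatorname*{arg\,min}_{y \in \mathbb{R}^d} \sum_{n=1}^{N} |y - P_S p_n|^2 = \frac{1}{N} \sum_{n=1}^{N} P_S p_n = \bar y_N$, as desired. □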

4. Random sampling and the Law of Large Numbers

We now consider points $\{p_n\}_{n=1}^{N}$ sampled independently at random from a Borel
probability measure $\mu$ on $O$; we wish to understand the statistical behavior of their
barycenter for large $N$. More precisely, let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, and for
each integer $n \ge 1$ let $p_n : \Omega \to O$ be a random point in $O$, with value $p_n(\omega)$ for fixed $\omega \in \Omega$. Assume
for all $n \ge 1$ that $p_1, \dots, p_n$ are independent random variables and that for any Borel
set $A \subseteq O$,

\[ \mathbb{P}(p_n \in A) = \mathbb{P}(\{\omega \in \Omega \mid p_n(\omega) \in A\}) = \mu(A). \]

The sample space $\Omega$ may be constructed as the set of infinite sequences $(p_1, p_2, p_3, \dots)$
of points in $O$ endowed with the product measure $\mathbb{P} = \prod_{n=1}^{\infty} \mu(p_n)$ on the $\sigma$-algebra
generated by cylinder sets. Observe that the folded points $\{F_k p_n(\omega)\}_{n=1}^{N} \subset \mathbb{R}^{d+1}$ are
independent, each distributed according to $\tilde\mu_k$.

Definition 4.1. For any positive integer $N$, let $b_N(\omega) = b(p_1, \dots, p_N)$ denote the
barycenter of the random sample $\{p_1(\omega), \dots, p_N(\omega)\}$. This random point in $O$ is
the empirical mean of the distribution $\mu$. Similarly, for $k \in \{1, \dots, K\}$, the
random point $\eta_{k,N}(\omega) \in \mathbb{R}^{d+1}$ denotes the $k$th folded average of the random sample
$\{p_1(\omega), \dots, p_N(\omega)\}$, as defined by (3.5).

The goal is to understand the statistical behavior of empirical means as $N \to \infty$.
A consequence of the square integrability assumption (2.3) is that the limit $b$ in the
next result is the mean of a random point on $O$ having probability measure $\mu$ [Fre48]:

\[ b = \operatorname*{arg\,min}_{p \in O} \int_O d(p, q)^2 \, d\mu(q). \]

Lemma 4.2 (Strong Law of Large Numbers). There is a unique point $b \in O$ such that

\[ \lim_{N \to \infty} b_N(\omega) = b \]

holds $\mathbb{P}$-almost surely. The point $b$ is the Fréchet mean (or barycenter) of $\mu$.

Proof. This is a special case of [Stu03, Proposition 6.6], whose generality occurs in the
context of distributions on globally nonpositively curved spaces. (An elementary proof
from scratch is also possible, using arguments similar to the proof of Theorem 4.3. In
general on metric spaces, there can be more than one Fréchet mean, and there are
corresponding set-valued strong laws [Zie77, BP03].) □

Theorem 4.3 (Sticky LLN). Assume integrability (2.2) and nondegeneracy (2.4).

1. If the moment $m_j$ satisfies $m_j < 0$, then there is a random integer $N^*(\omega)$ such
that $b_N(\omega) \notin L_j^+$ for all $N \ge N^*(\omega)$ holds $\mathbb{P}$-almost surely. Furthermore, $b \notin L_j^+$.

2. If the moment $m_k$ satisfies $m_k > 0$, then there is a random integer $N^*(\omega)$ such
that $b_N(\omega) \in L_k^+$ for all $N \ge N^*(\omega)$ holds $\mathbb{P}$-almost surely. Furthermore, $b \in L_k^+$.

3. If the moment $m_k$ satisfies $m_k = 0$, then there is a random integer $N^*(\omega)$ such
that $b_N(\omega) \in L_k$ for all $N \ge N^*(\omega)$ holds $\mathbb{P}$-almost surely. Furthermore, $b \in L_0$.

Proof. By the usual strong law of large numbers,

\[ \lim_{N \to \infty} \eta_{k,N} = \bar\eta_k = \int_{\mathbb{R}^{d+1}} x \, d\tilde\mu_k(x) \]

holds $\mathbb{P}$-almost surely. Observe that $m_k = \pi_0 \bar\eta_k$. Therefore, if $m_k > 0$, then $\bar\eta_k \in H_+$ and
$\eta_{k,N} \in H_+$ for all sufficiently large $N$. In that case, $b_N \in L_k^+$ for all sufficiently large $N$
by Lemma 3.3. In fact, $\pi_0 F_k b_N = \pi_0 \eta_{k,N} > m_k/2 > 0$ for $N$ sufficiently large, so by
virtue of Lemma 4.2, $b \in L_k^+$. The same argument starting with $m_k \ge 0$ proves the
case $m_k = 0$. On the other hand, if $m_j < 0$, then $\eta_{j,N} \in H_-$ for all sufficiently large $N$;
Lemma 3.3 implies that $b_N \notin L_j^+$ for all sufficiently large $N$, and $b \notin L_j^+$. □
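
The sticky LLN can be observed in a small simulation. The sketch below is illustrative only and rests on stated assumptions: it uses the spider with `K` rays (the case d = 0), draws the ray uniformly and the radius uniformly from [0, 1), and uses the hypothetical function name `last_time_off_spine`. For K = 3 this distribution has first moments $m_k = \frac{1}{3} \cdot \frac{1}{2} - \frac{2}{3} \cdot \frac{1}{2} = -\frac{1}{6} < 0$, so its mean is sticky. By Lemma 3.3, the empirical mean lies off the spine exactly when some folded average has positive 0th coordinate.

```python
import random

def last_time_off_spine(n_samples, K=3, seed=0):
    """Largest N <= n_samples at which the empirical mean is off the spine.

    Samples a uniformly random ray and a Uniform[0, 1) radius at each step
    (an assumed test distribution) and tracks N * pi_0(eta_{k,N}) per ray.
    """
    rng = random.Random(seed)
    signed_sums = [0.0] * K
    last_off = 0
    for n in range(1, n_samples + 1):
        leaf = rng.randrange(K)
        r = rng.random()
        for k in range(K):
            # Folding onto ray k: +r if the sample is on ray k, -r otherwise.
            signed_sums[k] += r if k == leaf else -r
        if max(signed_sums) > 0:
            last_off = n
    return last_off
```

The returned index plays the role of the random integer $N^*(\omega)$ in Theorem 4.3, truncated to the simulated horizon. Each signed sum drifts downward at rate 1/6 per sample, so in practice the empirical mean settles on the vertex after a modest number of samples.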

As a consequence, to say that the mean of $\mu$ is sticky implies that the empirical
mean $b_N$ sticks to the spine $L_0 \subset O$ for all sufficiently large $N$, in the following sense.

Corollary 4.4. If the mean of $\mu$ is sticky, then there is a random integer $N^*(\omega)$ such
that $b_N(\omega) \in L_0$ for all $N \ge N^*(\omega)$ holds $\mathbb{P}$-almost surely. Moreover, $b \in L_0$. If the
mean of $\mu$ is partly sticky, with $m_k = 0$, then there is a random integer $N^*(\omega)$
such that $b_N(\omega) \in L_k$ for all $N \ge N^*(\omega)$ holds $\mathbb{P}$-almost surely. Moreover, $b \in L_0$.

Recall that $P_S$ is the orthogonal projection onto the spine $S$. The measure $\mu$ pushes
forward along the projection to a measure $\mu_S = \mu \circ P_S^{-1}$ on $S$:

\[ \mu_S(A) = \mu(P_S^{-1} A) \]

for any Borel set $A \subseteq \mathbb{R}^d$. Note that $\mu_0(A) \le \mu_S(A)$ for all Borel sets $A \subseteq S$, but
$\mu_S \ne \mu_0$ by Convention 2.3.

Corollary 4.5. In all cases (sticky, nonsticky, partly sticky), the limit $b \in O$ satisfies

\[ P_S b = \int_S y \, d\mu_S(y). \tag{4.7} \]

Proof. By Lemma 3.5 and Theorem 4.3,

\[ P_S b = P_S \lim_{N \to \infty} b_N = \lim_{N \to \infty} \bar y_N \]

holds almost surely. By the strong law of large numbers for $\bar y_N \in S = \mathbb{R}^d$, the last
limit is (4.7). □

5. Central Limit Theorems

In this section we consider fluctuations of the empirical mean $b_N(\omega)$ about the
asymptotic limit $b$, within the tangent cone at $b$. We have shown that if the mean is
either sticky or partly sticky, then $b \in S$, and the tangent cone at $b$ is an open book $O$.
On the other hand, if the mean is nonsticky, with $m_k > 0$, then $b$ is in the interior of
the leaf $L_k$ and the tangent cone at $b$ is the vector space $\mathbb{R}^{d+1}$. We treat these two
scenarios separately.

These facts essentially follow from Theorem 4.3, which shows that in the sticky cases, with
probability one, the fluctuations away from the mean in certain directions stop as more
random variables are added to the empirical mean. In particular, this implies that the
correctly normalized limit of the fluctuation from the mean cannot in the sticky case
converge to a Gaussian random variable as one would have in the standard central
limit theorem. Since the fluctuations in some directions are exactly zero at some point
along each sequence of random variables, it is not altogether surprising that the limiting
measure has mass concentrated on a lower-dimensional set. This is the content of
Theorem 5.7, which is the principal result of this section.

5.1. The sticky Central Limit Theorem. Throughout this section, assume that
the mean is either sticky (with first moments $m_j < 0$ for all $j$) or partly sticky (with
$m_k = 0$); that is, $m_j < 0$ for all $j \in \{1, \dots, K\}$ with $j \ne k$, and $m_k \le 0$. Hence $b \in L_0$. The
central limit theorem involves a centered and rescaled empirical mean.

Definition 5.1 (Rescaled empirical mean). Assume that $P_S b = 0$ (after the action of
$-P_S b \in S$ on $O$ as explained in Remark 1.6, if necessary). The rescaled empirical mean
is the random variable $\sqrt{N} b_N \in O$. Write $\nu_N$ for its induced probability law on $O$:

\[ \nu_N(A) = \mathbb{P}(\sqrt{N} b_N \in A) \]

for all Borel sets $A \subseteq O$.

Since in sticky settings, we need to collapse fluctuations in some directions back to 
the spine, it is convenient to define the following projection. 

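Definition 5.2. The convex projection $P$ of $\mathbb{R}^{d+1}$ onto $\overline{H}_+$ is

\[ P x = \begin{cases} x & \text{if } x^{(0)} \ge 0, \\ (0, x^{(1)}, \dots, x^{(d)}) & \text{if } x^{(0)} < 0. \end{cases} \]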

We now define measures which, as we will see shortly, describe the limiting behaviors of
$\nu_N$ as $N \to \infty$. In short, they are the limiting measures in the central limit theorem
given in Theorem 5.7 below.

Definition 5.3. Assume square integrability (2.3) and assume that $P_S b = 0$.

1. The spinal limit measure $g_S$ is the law of a multivariate normal random variable
on the spine $S = \mathbb{R}^d$ with mean zero and covariance matrix

\[ C_S = \int_{\mathbb{R}^d} y \, y^T \, d\mu_S(y). \]

2. The $k$th costal^1 limit measure $g_k$ is the law of a multivariate normal random
variable on $\mathbb{R}^{d+1}$ with mean zero and covariance matrix

\[ C_k = \int_{\mathbb{R}^{d+1}} x \, x^T \, d\tilde\mu_k(x). \]

3. The $k$th spinocostal^2 limit measure $h_k$ on the closed leaf $L_k = \overline{H}_+$ is defined by

\[ h_k(A) = h_k^0(F_k(A) \cap H) + g_k(F_k(A) \cap H_+) \]

for Borel sets $A \subseteq L_k$, where the semispinal limit measure $h_k^0$ on $L_0$ is defined by

\[ h_k^0\big((P_S|_{L_0})^{-1} B\big) = g_S(B) - g_k\big((0, \infty) \times B\big) \]

for Borel sets $B \subseteq S$. (A possibly more natural definition of $h_k$ is given in
Proposition 5.6 below.)

^1 adjective: of or pertaining to the ribs, in anatomy

Remark 5.4. Square integrability (2.3) implies that the covariance matrices are finite.

Remark 5.5. The semispinal limit measure $h_k^0$ is generally not Gaussian. Although the
orthogonal projection to $H$ of any Gaussian measure on $\mathbb{R}^{d+1}$ is Gaussian, $h_k^0$ is the
projection of only half of a Gaussian; this is implied by Proposition 5.6, an alternate
direct description of $h_k$ interpolating between the first two parts of Definition 5.3.

Proposition 5.6. The spinocostal limit measure is the pushforward of the costal limit
measure $g_k$ under convex projection: $h_k = g_k \circ P^{-1} \circ F_k$.

Proof. Since the measures agree on $L_k$ outside of $L_0$ by definition, it is enough to show
that

\[ h_k^0\big((P_S|_{L_0})^{-1} B\big) = g_k\big(P^{-1} \circ (\pi_S|_H)^{-1} B\big) \tag{5.8} \]

for any Borel set $B \subseteq S$. For any vectors $w, w' \in \mathbb{R}^{d+1}$ that lie on the spine $H \subset \mathbb{R}^{d+1}$,
considering them as vectors $z = \pi_S(w)$, $z' = \pi_S(w') \in S = \mathbb{R}^d$ results in the quantities
$z^T C_S z'$ and $w^T C_k w'$. The integrals in Definition 5.3 directly imply that $z^T C_S z' =
w^T C_k w'$. Consequently, the matrix $C_S$ is a submatrix of $C_k$; the action of $C_k$ on the
subspace $H$ is given by $C_S$. Thus $g_S(B) = g_k((-\infty, \infty) \times B)$, and hence by definition

\[ h_k^0\big((P_S|_{L_0})^{-1} B\big) = g_k\big((-\infty, \infty) \times B\big) - g_k\big((0, \infty) \times B\big) = g_k\big((-\infty, 0] \times B\big) = g_k\big(P^{-1} \circ (\pi_S|_H)^{-1} B\big) \]

for any Borel set $B \subseteq S$. □

Now we come to the primary result in the paper: as the sample size becomes
large, the law $\nu_N$ of the rescaled empirical mean converges weakly to the appropriate
measure from Definition 5.3, according to how sticky the mean is. When the mean is

1. sticky, $\nu_N$ converges weakly to the spinal limit measure $g_S$.

2. partly sticky, $\nu_N$ converges weakly to the spinocostal limit measure $h_k$ supported
on the (unique) leaf $L_k$ with moment $m_k = 0$.

As discussed at the start of the section, the fact that the limiting distribution is
supported on the spine $S$ when the mean is sticky follows from Theorem 4.3, since then
the first moments $m_j$ are strictly negative for all $j$.

^2 adjective: spanning the ribs and spine, in anatomy

Theorem 5.7 (Sticky CLT). Let $\mu$ be a nondegenerate (2.4) probability distribution
on the open book $O$ with finite second moment (2.3).

1. If the mean of $\mu$ is sticky, then for any continuous, bounded function $\phi : O \to \mathbb{R}$,

\[ \lim_{N \to \infty} \int_O \phi(p) \, d\nu_N(p) = \int_S \phi \circ (P_S|_{L_0})^{-1}(q) \, dg_S(q). \]

2. If the mean of $\mu$ is nonsticky, then see Theorem 5.11.

3. If the mean of $\mu$ is partly sticky, with first moment $m_k = 0$, then for any
continuous bounded function $\phi : O \to \mathbb{R}$,

\[ \lim_{N \to \infty} \int_O \phi(p) \, d\nu_N(p) = \int_{\overline{H}_+} \phi \circ F_k^{-1}(q) \, dh_k(q). \]

Proof. The proof works by decomposing the relevant measures (the empirical mean on 
the open book and its pushforward to $\mathbb{R}^{d+1}$ under folding) into pieces corresponding 
to the leaves and the spine. 

Suppose that the mean is partly sticky with first moment $m_k = 0$. Let $\eta_N = \eta_{k,N}$ as 
in (5.5), and let $\nu_{\eta,N}$ denote the law of $\sqrt{N}\,\eta_N$ on $\mathbb{R}^{d+1}$. By Lemma 3.3, 
$\nu_N(A) = \nu_{\eta,N}(F_k A)$ for any Borel set $A \subset L_k^+$, and if $\phi$ is a continuous and bounded function, 
then 

$$\int_{\mathcal{O}} \phi(p)\, d\nu_N(p) = \int_{L_k^+} \phi(p)\, d\nu_N(p) + \int_{\mathcal{O} \setminus L_k^+} \phi(p)\, d\nu_N(p)$$ 

$$= \int_{H_+} \phi\big((F_k^{-1}|_{H_+})(y)\big)\, d\nu_{\eta,N}(y) + \int_{\mathcal{O} \setminus L_k^+} \phi(p)\, d\nu_N(p).$$ 

The standard CLT in $\mathbb{R}^{d+1}$ (e.g. [Bre92, Thm. 11.10]) implies that the random variable 
$\sqrt{N}\,\eta_N$ converges in distribution to a centered Gaussian with covariance $C_k$. Therefore, 

$$\lim_{N\to\infty} \int_{H_+} \phi\big((F_k^{-1}|_{H_+})(y)\big)\, d\nu_{\eta,N}(y) = \int_{H_+} \phi\big((F_k^{-1}|_{H_+})(y)\big)\, dg_k(y).$$ 

Lemma 5.8. If the $j$-th first moment satisfies $m_j < 0$, then $\nu_N(L_j^+) \to 0$ and 

$$\lim_{N\to\infty} \int_{L_j^+} \phi(p)\, d\nu_N(p) = 0.$$ 

Proof. Theorem 4.3.1. □ 
Resuming the proof of the theorem, consider the term 

$$\int_{\mathcal{O} \setminus L_k^+} \phi(p)\, d\nu_N(p) = \int_{L_0} \phi(p)\, d\nu_N(p) + \int_{L_k^-} \phi(p)\, d\nu_N(p),$$ 

where $L_k^- = \mathcal{O} \setminus L_k = \bigcup_{j \neq k} L_j^+$, which excludes the spine $L_0$. With the projection 
$P_0 : \mathcal{O} \to L_0$, $(x^{(0)}, x, j) \mapsto (0, x, j)$, the function $p \mapsto \phi(P_0 p)$ is again continuous and 
bounded, so Lemma 5.8 implies that 

$$\lim_{N\to\infty} \int_{L_k^-} \phi(P_0 p)\, d\nu_N(p) = 0. \qquad (5.9)$$ 

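Therefore, 

$$\lim_{N\to\infty} \int_{\mathcal{O} \setminus L_k^+} \phi(p)\, d\nu_N(p) = \lim_{N\to\infty} \int_{\mathcal{O} \setminus L_k^+} \phi(P_0 p)\, d\nu_N(p)$$ 

$$= \lim_{N\to\infty} \left( \int_{L_k^-} \phi(P_0 p)\, d\nu_N(p) + \int_{L_0} \phi(P_0 p)\, d\nu_N(p) \right)$$ 

$$= \lim_{N\to\infty} \left( \int_{\mathcal{O}} \phi(P_0 p)\, d\nu_N(p) - \int_{L_k^+} \phi(P_0 p)\, d\nu_N(p) \right).$$ 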

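Observe that 

$$\int_{\mathcal{O}} \phi(P_0 p)\, d\nu_N(p) = \int_S \phi \circ (P_S|_{L_0})^{-1}(y)\, d\gamma_N(y),$$ 

where $\gamma_N = \nu_N \circ P_S^{-1}$, which is the law of $\sqrt{N}\,\bar{y}_N$ on $S$, where $\bar{y}_N$ is the projected 
barycenter from Lemma 3.5. Therefore, setting $\hat\phi = \phi \circ (P_S|_{L_0})^{-1}$ and applying the 
usual CLT to $\sqrt{N}\,\bar{y}_N \in \mathbb{R}^d$, 

$$\lim_{N\to\infty} \int_{\mathcal{O}} \phi(P_0 p)\, d\nu_N(p) = \lim_{N\to\infty} \int_S \hat\phi(y)\, d\gamma_N(y) = \int_S \hat\phi(y)\, dg_S(y).$$ 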
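We cannot apply the same argument to 

$$\lim_{N\to\infty} \int_{L_k^+} \phi(P_0 p)\, d\nu_N(p) = \lim_{N\to\infty} \int_S \hat\phi(y)\, d\tau_N(y)$$ 

with $\tau_N = \nu_N \circ (P_S|_{L_k^+})^{-1}$ because there is no CLT for $\tau_N$. We have, however, above 
derived a CLT for $\nu_N \circ F_k^{-1} = \nu_{\eta,N}$ on $H_+ = F_k(L_k^+)$: 

$$\lim_{N\to\infty} \int_{L_k^+} \phi(P_0 p)\, d\nu_N(p) = \lim_{N\to\infty} \int_{H_+} \tilde\phi(q)\, d\nu_{\eta,N}(q) = \int_{H_+} \tilde\phi(q)\, dg_k(q),$$ 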

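where $\tilde\phi = \phi \circ P_0 \circ F_k^{-1}$. In summary, we have shown that 

$$\lim_{N\to\infty} \int_{\mathcal{O}} \phi(p)\, d\nu_N(p) = \int_{H_+} \phi \circ F_k^{-1}(q)\, dg_k(q) + \int_S \hat\phi(y)\, dg_S(y) - \int_{H_+} \tilde\phi(q)\, dg_k(q)$$ 

$$= \int_{H_+} \phi \circ F_k^{-1}(q)\, dg_k(q) + \int_H \phi \circ F_k^{-1}(q)\, dh_k^S(q)$$ 

$$= \int_{H_+} \phi \circ F_k^{-1}(q)\, dh_k(q),$$ 

where the second equality uses the fact that $\tilde\phi = \phi \circ F_k^{-1}$ on $H$, and the final equality 
the fact that $g_k$ has no mass supported on the spine $H$, so the integral of $\phi \circ F_k^{-1}\, dg_k$ 
over the half-space takes the same value whether or not $H$ is included. 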

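The sticky case proceeds in much the same way as the partly sticky case does, except 
that instead of Eq. (5.9), the simpler statement 

$$\lim_{N\to\infty} \int_{\mathcal{O} \setminus L_0} \phi(P_0 p)\, d\nu_N(p) = 0$$ 

holds, because Lemma 5.8 now applies to every leaf. From that, the next step results in 

$$\lim_{N\to\infty} \int_{\mathcal{O}} \phi(p)\, d\nu_N(p) = \lim_{N\to\infty} \int_{\mathcal{O}} \phi(P_0 p)\, d\nu_N(p),$$ 

and then the usual CLT applied to $\sqrt{N}\,\bar{y}_N \in \mathbb{R}^d$ proves the desired result. □ 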
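
As an informal numerical illustration of the sticky and partly sticky cases (not part of the proof), the following Python sketch simulates the 3-spider from the Introduction, that is, the open book with d = 0 and three leaves. On the spider the sum of squared distances from a point at distance t on leaf k to the sample points p_i equals the sum of (t - F_k(p_i))^2, so the sketch places the sample barycenter on leaf k at the folded average when that average is positive, and at the origin otherwise. The function names, the leaf weights, the Exp(1) law for distances, and the sample sizes are hypothetical choices made only for this example.

```python
import random

def fold(k, point):
    """Folding map F_k on the spider: leaf k maps to [0, inf), other leaves to (-inf, 0]."""
    leaf, x = point
    return x if leaf == k else -x

def spider_mean(sample, leaves=3):
    """Barycenter of a finite sample on the spider, as (leaf, distance from the origin).

    Returns (None, 0.0) when the barycenter is the origin, i.e. lies on the spine.
    """
    n = len(sample)
    for k in range(leaves):
        eta = sum(fold(k, p) for p in sample) / n  # k-th folded average
        if eta > 0:  # at most one folded average can be positive
            return (k, eta)
    return (None, 0.0)

def draw(weights, rng):
    """One random point: leaf chosen with the given weights, distance ~ Exp(1) (assumed law)."""
    leaf = rng.choices(range(len(weights)), weights=weights)[0]
    return (leaf, rng.expovariate(1.0))

def fraction_on_spine(weights, n, reps, rng):
    """Fraction of replications whose sample barycenter is exactly the origin."""
    hits = 0
    for _ in range(reps):
        sample = [draw(weights, rng) for _ in range(n)]
        if spider_mean(sample, len(weights))[0] is None:
            hits += 1
    return hits / reps

rng = random.Random(0)
# Sticky: every first moment equals 1/3 - 2/3 = -1/3 < 0.
sticky = fraction_on_spine([1/3, 1/3, 1/3], n=1000, reps=200, rng=rng)
# Partly sticky: the first moment on leaf 0 is 1/2 - 1/2 = 0, the others are -1/2.
partly = fraction_on_spine([1/2, 1/4, 1/4], n=1000, reps=200, rng=rng)
print(sticky, partly)
```

Under these assumptions the first fraction should be essentially 1, in line with the sticky law of large numbers, while the second should be close to 1/2, which is the mass that the spinocostal limit measure places on the spine in this one-dimensional setting.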

5.2. The nonsticky Central Limit Theorem. If the mean is nonsticky with first 
moment $m_k > 0$, then the limit $b$ is in the interior of $L_k$. In this case, the tangent 
cone at $b$ is the vector space $\mathbb{R}^{d+1}$, and the fluctuations of $b_N$ about the limit $b$ are 
qualitatively similar to what is described in the classical central limit theorem. 

Definition 5.9. In this section we let $\nu_N$ be the law on $\mathbb{R}^{d+1}$ of the random variable 
$\sqrt{N}(F_k b_N - F_k b)$: 

$$P\big(\{\omega \mid \sqrt{N}(F_k b_N - F_k b) \in A\}\big) = \nu_N(A)$$ 
for all Borel sets $A \subset \mathbb{R}^{d+1}$. 

Definition 5.10. Assume $m_k > 0$. Let $g_k$ be the law of a multivariate normal random 
variable on $\mathbb{R}^{d+1}$ with mean zero and covariance matrix 

$$C_k = \int_{\mathbb{R}^{d+1}} (x - F_k b)(x - F_k b)^T\, d\tilde\mu_k(x).$$ 

Unlike the case of the sticky and partly sticky mean, the weak limit of $\nu_N$ is that of 
a nondegenerate Gaussian on $\mathbb{R}^{d+1}$: 

Theorem 5.11 (Nonsticky CLT). Assume $m_k > 0$. Then for any continuous 
bounded function $\phi : \mathbb{R}^{d+1} \to \mathbb{R}$, 

$$\lim_{N\to\infty} \int_{\mathbb{R}^{d+1}} \phi(x)\, d\nu_N(x) = \int_{\mathbb{R}^{d+1}} \phi(x)\, dg_k(x).$$ 

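Proof. Since $m_k > 0$, $b \in L_k^+$ and Lemma 3.3 implies $F_k b = \eta = \int_{\mathbb{R}^{d+1}} x\, d\tilde\mu_k(x)$. Also, 

$$\sqrt{N}\big(F_k b_N(\omega) - F_k b\big) = \sqrt{N}\big(\eta_{k,N}(\omega) - \eta\big), \quad \forall N > N^*(\omega)$$ 
holds with probability one. Therefore, for any Borel set $A$, 

$$\Big| \nu_N(A) - P\big(\{\omega \mid \sqrt{N}(\eta_{k,N}(\omega) - \eta) \in A\}\big) \Big| \le R_N,$$ 

where $R_N = P(\{\omega \mid N \le N^*(\omega)\})$. By the classical central limit theorem, the random 
variable $\sqrt{N}(\eta_{k,N}(\omega) - \eta)$ converges in law to a centered, multivariate Gaussian on $\mathbb{R}^{d+1}$ 
with covariance $C_k$ as $N \to \infty$. Consequently, 

$$\limsup_{N\to\infty} \left| \int_{\mathbb{R}^{d+1}} \phi(x)\, d\nu_N(x) - \int_{\mathbb{R}^{d+1}} \phi(x)\, dg_k(x) \right| \le 2 \limsup_{N\to\infty} R_N \|\phi\|_\infty = 0. \qquad □$$ 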
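
The nonsticky statement can also be checked numerically in the simplest case. The sketch below again uses the 3-spider (d = 0, three leaves), with hypothetical leaf weights (3/4, 1/8, 1/8) and Exp(1) distances, so that the first moment on leaf 0 is 3/4 - 1/4 = 1/2 > 0 and the one-dimensional analogue of $C_k$ is 2 - 1/4 = 1.75. It computes the folded coordinate of the sample barycenter under the same characterization of the spider barycenter via folded averages; all names and parameters are illustrative assumptions, not taken from the paper.

```python
import math
import random
import statistics

def fold(k, point):
    """Folding map F_k on the spider: leaf k maps to [0, inf), other leaves to (-inf, 0]."""
    leaf, x = point
    return x if leaf == k else -x

def folded_barycenter(sample, k, leaves=3):
    """Coordinate F_k(b_N) of the sample barycenter b_N on the spider."""
    n = len(sample)
    for j in range(leaves):
        eta = sum(fold(j, p) for p in sample) / n  # j-th folded average
        if eta > 0:  # barycenter lies on leaf j at distance eta
            return eta if j == k else -eta
    return 0.0  # barycenter is the origin

rng = random.Random(1)
weights = [3/4, 1/8, 1/8]  # hypothetical leaf probabilities
m0 = 0.5                   # first moment on leaf 0 for Exp(1) distances
n, reps = 500, 400

rescaled = []
for _ in range(reps):
    sample = [(rng.choices(range(3), weights=weights)[0], rng.expovariate(1.0))
              for _ in range(n)]
    # Rescaled fluctuation sqrt(N) * (F_0 b_N - F_0 b), with F_0 b = m0.
    rescaled.append(math.sqrt(n) * (folded_barycenter(sample, 0) - m0))

print(statistics.fmean(rescaled), statistics.variance(rescaled))
```

If the assumptions hold, the printed mean should be near 0 and the printed variance near 1.75, consistent with a centered Gaussian limit with covariance $C_k$.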